Bioinformatics A Practical Guide to Next Generation Sequencing Data Analysis (Hamid D. Ismail)


qlfq <- glmTreat(fitq, contrast=my.contrasts, lfc=2)
keg <- kegga(qlfq, species="Hs")
keg20 <- topKEGG(keg, sort="up", n=20)
write.csv(keg20, file="keg20.csv")

5.3.8 Visualizing RNA-Seq Data

We have already discussed some methods for visualizing RNA-Seq data. For publication purposes, we can create high-resolution color graphics using the “vidger” Bioconductor package, which generates information-rich visualizations for interpreting differential gene expression results from edgeR, DESeq2, and Cuffdiff [39]. The “vidger” package can be installed in R using the following:

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("vidger")

Once the package has been installed, we can load it using:

library("vidger")
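Before running the scripts in this section, it can help to confirm that every package they load is actually installed. The following is a minimal sketch using only base R; the helper name check_packages is hypothetical and is not part of vidger or BiocManager.

```r
# Hypothetical helper (an assumption for illustration, not a vidger or
# BiocManager function): reports, for each package name, whether its
# namespace can be loaded. Packages are not attached to the search path.
check_packages <- function(pkgs) {
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

# Packages loaded by the scripts in this section
status <- check_packages(c("edgeR", "vidger", "org.Hs.eg.db"))
print(status)

# List anything that still needs BiocManager::install()
not_installed <- names(status)[!status]
if (length(not_installed) > 0) {
  message("Not installed: ", paste(not_installed, collapse = ", "))
}
```

Any package reported as not installed can be installed with BiocManager::install() in the same way as “vidger” above.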

In the following, we will use the “vidger” package to create plots that visualize the example RNA-Seq data. Open R and set the “features” directory, where you saved the RNA-Seq count data file, as your working directory. The vidger functions require a DGEList object with group information and normalized count data. The following script creates a DGEList object, “yNorm”, that can be used as input for these functions:

# Loading packages
library(edgeR)
library("vidger")
library(org.Hs.eg.db)

# Loading data
seqdata <- read.delim("htcount2.txt", stringsAsFactors=FALSE)
sampleinfo <- read.delim("sampleinfo.txt", stringsAsFactors=FALSE)

# Preparing data: drop the first two columns, use the first column as
# row names, and keep only genes with a nonzero total count
countdata0 <- seqdata[,-(1:2)]
rownames(countdata0) <- seqdata[,1]
colnames(countdata0) <- sampleinfo$sampleid
countdata <- countdata0[rowSums(countdata0)>0,]
group <- factor(sampleinfo$condition)

# Creating DGEList object
y <- DGEList(countdata, group=group)

# Adding annotation (gene symbols without an Entrez match map to NA)
ENTREZID <- mapIds(org.Hs.eg.db, rownames(y),
                   keytype="SYMBOL", column="ENTREZID")
rownames(y$counts) <- ENTREZID
ann <- select(org.Hs.eg.db, keys=rownames(y$counts),
              columns=c("ENTREZID","SYMBOL","GENENAME"))
y$genes <- ann